LILA: Cellular Telephone Speech Databases from Asia

Authors

  • Eric Sanders
  • Asunción Moreno
  • Herbert S. Tropf
  • Lynette Melnar
  • Nurit Dekel
  • Breanna Gillies
  • Niklas Paulsson
Abstract

The goal of the LILA project was to collect speech databases over cellular telephone networks for five languages in three Asian countries. Three languages were recorded in India: Hindi by first-language speakers, Hindi by second-language speakers, and Indian English. In addition, Mandarin was recorded in China and Korean in South Korea. The databases are part of the SpeechDat family and follow the SpeechDat rules in many respects. All databases have been completed and have passed the validation tests. Both Hindi databases and the Korean database will be available to the public for sale.

Similar resources

Collection of SLR in the Asian-Pacific Area

The goal of this project (LILA) is the collection of a large number of spoken databases for training Automatic Speech Recognition Systems for telephone applications in the Asian-Pacific area. Specifications follow those of SpeechDat-like databases. Utterances will be recorded directly from calls made either from fixed or cellular telephones and are composed of read text and answers to specific ...

MAT-2000 - design, collection, and validation of a Mandarin 2000-speaker telephone speech database

Mandarin speech data Across Taiwan (MAT) is a project initiated by members of the Association for Computational Linguistics and Chinese Language Processing (ACLCLP) to collect speech data through public telephone networks in Taiwan. In total, over 7000 Taiwanese individuals have provided speech data. The results were released as a series of MAT speech databases to the research community in Taiwan...

Building speech databases for cellular networks

The number of telephone applications that use automatic speech recognition is increasing fast. At the same time the use of mobile telephones is rising at high speed. This causes a need for databases with speech recorded over the cellular network. When creating a mobile speech database a number of problems show up that are not an issue when creating a speech database of fixed network recordings....

Towards robust speech recognition in the telephony network environment - cellular and landline conditions

We describe several speaker-independent speech recognition studies conducted with both landline and cellular network telephone data. The cellular environment included the three dominant standards found in the United States: CDMA, TDMA and GSM. Our goal was to design a system that operated over all these four channels, handling their innate variations, such as those of background and line charac...

Cellular-phone based speech-to-speech translation system ATR-MATRIX

We describe the implementation of a cellular-phone based speech translation system without telephone quality speech database or special CT hardware. The purpose is to quickly build a prototype service system that can be used for data collection with real users. To train the acoustic model for the speech recognition system, available high-quality databases were made usable by 1.) appropriate dow...

Publication date: 2008